1 Introduction


A necessary input to planet occurrence calculations is an accurate model for the pipeline
completeness (Burke et al. 2015). This document describes the use of the Kepler planet occurrence
rate products to calculate a per-target detection contour for the measured Data Release 25
(DR25) pipeline performance. A per-target detection contour measures, for a given combination
of orbital period, P_orb, and planet radius, R_p, what fraction of transit signals are
recoverable by the Kepler pipeline (Jenkins et al. 2017). The steps for calculating a detection
contour follow the procedure outlined in (2015), but have been updated to provide improved
accuracy enabled by the substantially larger database of transit injection and recovery tests
that were performed on the final version (i.e., SOC 9.3) of the Kepler pipeline (2017a). In the
following sections, we describe the main inputs to the per-target detection contour and provide
a worked example of the python software released with this document (Kepler Planet Occurrence
Rate Tools; KeplerPORTs) that illustrates the generation of a detection contour in practice. As
background material for this document and its nomenclature, we recommend the reader be
familiar with the previous method of calculating a detection contour (Section 2 of 2015), input
parameters relevant for describing the data quantity and quality of Kepler targets
(Burke & Catanzarite 2017b), and the extensive new transit injection and recovery tests of the
Kepler pipeline (Christiansen et al. 2016; Burke & Catanzarite 2017a; 2017).


2 Planet Radius to Multiple Event Statistic 


This section describes the calculation necessary to estimate the primary detection statistic
employed in the Transiting Planet Search (TPS) algorithm, the Multiple Event Statistic (MES)
(2012), at a point (P_orb and R_p) of the detection contour. This is the first step toward
calculating a detection contour, as we must estimate what the MES would be for a hypothetical
planet for any given P_orb and R_p in the domain of the detection contour tailored for the
stellar properties of the Kepler target. We employ this MES estimate to obtain a completeness
estimate based upon the detection efficiency model, which is expressed in terms of MES (see
Section 3). To first order, the MES is proportional to and has the same dependence as the basic
equation for transit signal-to-noise ratio,


SNR = \frac{\Delta}{\sigma} \sqrt{N_{\rm tran}},    (1)


where Δ is the transit depth, σ is an estimate of the flux time series noise on a time scale
equivalent to the transit duration, and N_tran is the number of transit events contributing to
the detection. However, to be accurate enough for modeling the Kepler pipeline completeness,
the MES formally depends on how well the search template matches the transit signal and on the
time-varying noise estimate measured in TPS (the so-called Combined Differential Photometric
Precision (CDPP) time series; Christiansen et al. 2012; Thompson 2016). For DR25, the MES
estimate is given by,


MES_{\rm est}(R_p, P_{\rm orb}) = \frac{\Delta_{b=0}(R_p, \theta_\star)}{\sigma_{\rm one}(R_p, P_{\rm orb}, \theta_\star)} \, MES_{\rm cor},    (2)


where Δ_{b=0} is the transit depth for a central (impact parameter b = 0) planet crossing, θ_⋆ is
a list of the stellar parameters (R_⋆, log g, and T_eff) of the target along with limb darkening
parameters, σ_one is the one-sigma depth (OSD) function, and MES_cor is a scale factor (see
below) to achieve agreement with the MES values calculated by TPS.
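
As a minimal sketch of Equation 2 (not the KeplerPORTs implementation), the estimate reduces to
a ratio once the central-crossing depth and the OSD have been evaluated. The function and
argument names are hypothetical, and both depths are assumed to be in the same units (e.g., ppm).

```python
def mes_estimate(depth_b0, one_sigma_depth, mes_cor=1.0):
    """Equation 2: MES_est = Delta_{b=0} / sigma_one * MES_cor.

    depth_b0        -- transit depth for a b = 0 crossing
    one_sigma_depth -- OSD value at the same period and duration (same units)
    mes_cor         -- scale factor; 1.0 means no correction is applied
    """
    if one_sigma_depth <= 0.0:
        raise ValueError("one-sigma depth must be positive")
    return depth_b0 / one_sigma_depth * mes_cor


# A 200 ppm central transit on a target whose OSD is 20 ppm gives MES_est = 10.
print(mes_estimate(200.0, 20.0))
```
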
We employ Δ_{b=0} because it represents the maximum MES_est possible (for a given R_p and
P_orb), it is a simple algebraic expression for the case of a limb darkened transit (Mandel &
Agol 2002; see case 10), and it provides a convenient reference point for taking into account
the range of MES values encountered for a given R_p and P_orb (see Section 4) in the detection
contour modeling. Limb darkening parameters consistent with stellar parameters from the DR25
Stellar Catalog are provided in the DR25 Kepler Stellar Table for all Kepler targets searched
for planets. The KeplerPORTs software provides a python implementation for calculating Δ_{b=0}
modeled after Mandel & Agol (2002). The OSD function is described in Burke & Catanzarite
(2017b). To summarize, the OSD function quantifies the transit signal depth that results in a
MES = 1 signal for a given transit duration, τ, and P_orb. At fixed P_orb, the OSD function
takes into account effects such as the time varying CDPP noise, cadences deweighted during the
search, and the number of transit events, while averaging these quantities over orbital phase.
The OSD function depends on θ_⋆ through predicting the expected τ at a given point in the
detection contour. The OSD function values are tabulated over a very fine grid of P_orb and the
fourteen standard transit durations employed in the planet search in TPS (i.e., 1.5, 2.0, 2.5,
3.0, 3.5, 4.5, 5.0, 6.0, 7.5, 9.0, 10.5, 12.0, 12.5, 15.0 hr). KeplerPORTs provides python
functions to compute τ for a given θ_⋆, P_orb, and R_p, as well as code for interpolating the
OSD function in order to implement Equation 2.
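
The expected τ can be sketched for the simplest case: a circular orbit, a central (b = 0)
crossing, and R_p much smaller than R_⋆, with the stellar mass inferred from log g and R_⋆. These
simplifications, the constants, and the function name are assumptions for illustration; the
KeplerPORTs functions may differ in detail.

```python
import math

G_CGS = 6.674e-8       # gravitational constant [cm^3 g^-1 s^-2]
R_SUN_CM = 6.957e10    # solar radius [cm]


def central_transit_duration_hr(rstar_rsun, logg_cgs, period_day):
    """Duration of a b = 0 transit on a circular orbit, ignoring R_p."""
    rstar = rstar_rsun * R_SUN_CM
    mstar = 10.0 ** logg_cgs * rstar ** 2 / G_CGS    # from g = G M / R^2
    period = period_day * 86400.0
    # Kepler's third law for the semi-major axis
    a = (G_CGS * mstar * period ** 2 / (4.0 * math.pi ** 2)) ** (1.0 / 3.0)
    # Time to cross the stellar diameter at orbital speed 2 pi a / P
    return rstar * period / (math.pi * a) / 3600.0


# Sun-like star on a one-year orbit: roughly 13 hr, below the 15 hr search limit.
print(round(central_transit_duration_hr(1.0, 4.438, 365.25), 1))
```
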
We verify the R_p to MES model by comparing MES_est from Equation 2 to the expected MES
(MES_exp) as measured in the flux-level transit injection (FLTI) tests (Burke & Catanzarite
2017a). The MES_exp values provided by the FLTI tests are calculated with the TPS algorithm
itself; thus the FLTI MES_exp is the standard we are trying to match with the approach taken in
Equation 2. For the MES comparison, we selected a subsample of 7199 targets from the available
32316 targets in the FLTI run having KSOC-5006 as its identifier (Burke & Catanzarite 2017a).
From these 7199 targets, 5742 remained after selecting a set of ‘well-behaved’ targets for the
MES comparison (see Section 3 for a more detailed definition of ‘well-behaved’). We dropped
from consideration targets with >10% of cadences removed by the multiple planet search.
Furthermore, we selected transit injections that have b < 0.05, have transit duration <15 hours,
pass the window function criteria with sufficient N_tran (2017b), and have
4.0 < MES_exp < 100.0. If a target has at least 20 transit injection trials meeting the above
criteria, we calculate the median ratio MES_ratio = MES_est/MES_exp for that target, assuming
MES_cor = 1. Figure 1 shows the resulting MES_ratio as a function of target R_⋆. The median
MES ratio is MES_cor = 1.003, and the standard deviation around the median is σ = 0.01. Our
simplified MES_est calculation agrees well with the MES values provided by the TPS algorithm;
however, in order to improve the agreement, we apply the MES_cor correction factor when
calculating MES_est for the detection contour model using Equation 2. Repeating the analysis
on other subsamples of targets from the KSOC-5006 FLTI run yielded the same result.
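
The per-target selection and median ratio described above can be sketched as follows. The tuple
layout and function name are hypothetical, and the window function criterion is omitted for
brevity.

```python
import statistics


def target_mes_ratio(injections, min_trials=20):
    """Median MES_est / MES_exp for one target, or None if too few trials.

    injections -- iterable of (mes_est, mes_exp, impact_b, duration_hr)
    """
    ratios = [
        mes_est / mes_exp
        for mes_est, mes_exp, impact_b, duration_hr in injections
        if impact_b < 0.05 and duration_hr < 15.0 and 4.0 < mes_exp < 100.0
    ]
    if len(ratios) < min_trials:
        return None
    return statistics.median(ratios)


# Twenty near-central injections whose estimate is 1% above the TPS value.
trials = [(10.1, 10.0, 0.01, 6.0)] * 20
print(target_mes_ratio(trials))
```
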

As more thoroughly discussed in Section 3, the outliers in Figure 1 are correlated with
parameters associated with suppressed recovery of transit signals in the Kepler pipeline. The
full database of FLTI results can be employed to develop a higher order R_p to MES model
and provide more accurate results for the outliers shown in the right panel of Figure 1. These
higher order effects are an insignificant contribution to planet occurrence for the Kepler dwarf
star target sample, so we do not address them with the current R_p to MES model. Science efforts
that require understanding Kepler pipeline completeness for individual targets, especially for
targets with less well-behaved flux time series properties, may need to provide a higher fidelity
R_p to MES model.


Figure 1: MES ratio between the MES estimated by Equation 2, assuming MES_cor = 1, and the
MES as measured using the TPS algorithm, as a function of R_⋆ (left panel; x-axis Rstar [Rsun])
and as a function of CDPP slope for short transit durations (right panel; x-axis CDPP Slope
Short; see Section 3.1).


3 Per-Target Detection Efficiency Model 


From previous analysis of the pixel level transit injection (PLTI) studies, Christiansen et al.
noted that the average detection efficiency (DE) performance varied with varying selections of
the target sample. A major emphasis of the DR25 analysis was to quantify this target-to-target
variation of the DE. To support this effort, we performed a series of flux-level transit
injection (FLTI) tests and released a database of 121 million transit injection and recovery
tests that probe the behavior of the TPS module of the Kepler pipeline (2017a). Burke &
Catanzarite (2017a) describe the FLTI output and illustrate the generation of a DE curve for an
individual target. The FLTI tests consist of a set of ‘deep’ FLTI runs, with ~600,000 injections
per target (Section 3.4) for a representative set of ~100 targets, supplemented by ‘shallow’
FLTI runs, with far fewer injections per target on a much larger set of ~10^4 targets. The deep
and shallow runs complement each other for the development and validation of the per-target DE
model, and they help identify external variables that correlate with variations in the DE. The
DE is a critical input for the detection contour, and previous planet occurrence rate
calculations (Burke et al. 2015) employed the average DE derived from PLTI studies (2015).
The average DE derived from the PLTI study is still valid for occurrence rates, especially those
involving large samples of targets. However, occurrence rate studies involving small subsets
of Kepler targets should use the DE model described herein, as it more accurately captures
target-by-target variations.


3.1 Characterizing Stellar Noise Using CDPP Slope 


Through trial and error, we determined that the external parameters that correlate most
strongly with the empirical per-target DE variations measured from the FLTI tests are stellar
radius (R_⋆), CDPP slope, and orbital period (P_orb). CDPP slope values are new to DR25
occurrence rate products, and we introduce them in this section before describing the DE model.
CDPP slope values are tabulated for all targets as part of the DR25 Kepler Stellar Table
(2017b). In order to calculate CDPP slope values, we begin with rmsCDPP values for the fourteen
transit durations employed in the search by the TPS algorithm (Thompson 2016). Under the
assumption of white Gaussian noise in the flux time series, the rmsCDPP values decrease toward
longer transit durations roughly as the inverse square root of the number of cadences
in-transit. We quantify how well a target follows this expectation of white Gaussian noise by
measuring the slope of rmsCDPP as a function of transit duration; a negative CDPP slope
indicates that the flux time series satisfies the white Gaussian noise assumption. To capture
the behavior in rmsCDPP trends with transit duration exhibited by Kepler targets, we calculate
a CDPP slope for the short transit durations (2 < τ < 4.5 hr), slpCDPP_s, and for the long
transit durations (7.5 < τ < 15 hr), slpCDPP_l. Figure 2 shows a scatter plot of CDPP slope
(short versus long) for the targets studied as part of the FLTI test. Targets in the lower left
corner of Figure 2 (negative CDPP slope values) empirically have a DE response closest to
the theoretical expectation, indicative of well-behaved flux time series data free of
astrophysical noise. Targets that depart from the ideal white Gaussian noise properties have
higher CDPP slope in both dimensions and have suppressed DE curves. We define the lower left
region of the CDPP slope plane (slpCDPP_l < -0.1 and slpCDPP_s < -0.4) as having well-behaved
noise properties because the corrections for a suppressed DE inside this region are smaller
than 5%. A majority (85%) of Kepler dwarf (R_⋆ < 1.1 R_⊙) targets are in the well-behaved region
of the CDPP slope plane and have the highest sensitivity DE curves.
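
The fitting procedure is not spelled out here, so the sketch below assumes a least-squares
slope in log-log space, for which pure white noise gives a slope of -0.5. The duration groupings
assume that the interval endpoints are included, and all names are illustrative rather than
taken from KeplerPORTs or the DR25 Stellar Table.

```python
import math

SHORT_DURATIONS_HR = (2.0, 2.5, 3.0, 3.5, 4.5)            # assumed short set
LONG_DURATIONS_HR = (7.5, 9.0, 10.5, 12.0, 12.5, 15.0)    # assumed long set


def loglog_slope(durations_hr, rms_cdpp):
    """Least-squares slope of log10(rmsCDPP) versus log10(duration)."""
    x = [math.log10(d) for d in durations_hr]
    y = [math.log10(c) for c in rms_cdpp]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx


def is_well_behaved(slp_long, slp_short):
    """Well-behaved region: long slope < -0.1 and short slope < -0.4."""
    return slp_long < -0.1 and slp_short < -0.4


def white_noise_cdpp(durations_hr, cdpp_1hr=100.0):
    """Idealized rmsCDPP that falls as duration**-0.5."""
    return [cdpp_1hr * d ** -0.5 for d in durations_hr]


slp_s = loglog_slope(SHORT_DURATIONS_HR, white_noise_cdpp(SHORT_DURATIONS_HR))
slp_l = loglog_slope(LONG_DURATIONS_HR, white_noise_cdpp(LONG_DURATIONS_HR))
print(is_well_behaved(slp_l, slp_s))
```
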

Qualitatively, targets with short, hourly time scale astrophysical variability populate the
lower diagonal population that branches off from the densely populated area at slpCDPP_l =
-0.4 in Figure 2. Subgiants and giants generally populate the region toward higher slpCDPP_l
and the upper right corner of the CDPP slope plane. It may be fruitful to investigate whether
the information in the CDPP slope plane can be employed as a stellar gravity indicator in

KSCI-19111-002: Detection Contour June 13, 2017 


Figure 2: Distribution of the Kepler target noise properties in terms of slpCDPP_s versus
slpCDPP_l (small points) for the subsample of targets with FLTI results. For purposes of
modeling the DE, this CDPP slope plane is parceled into bins (large points represent bin
centers), and corrections to the DE model are applied at these bin locations. The shading of the
large points (color bar: reduced χ²) represents the reduced χ² between the final DE model and
empirical DE measurements from the FLTI tests. The reduced χ² calculation is described in the
text, but locations in the CDPP slope plane with a white to red color shading are locations
where locally more than 10% of the targets have empirical DE that significantly disagree with
the final DE model.

similar fashion to the ‘flicker’ noise metric of (2016). Only targets chosen for the FLTI tests
are shown. A subsample of these targets was selected to populate the CDPP slope plane
uniformly, while others were selected randomly. Thus, the relative number of objects in various
regions of Figure 2 is not representative of the full Kepler target sample.


3.2 Stellar Radius Detection Efficiency Dependence 


The per-target DE model begins with a tabulation of the R_⋆ dependence. In order to capture
the average R_⋆ dependence of the DE curve independent of other effects, we select targets
from the shallow-run FLTI tests (FLTI run identifiers KSOC-5006 and KSOC-5104) that are in the
well-behaved region of the CDPP slope plane. After quality cuts, 19099 targets had sufficient
injections for inclusion in the analysis. In bins of 0.1 R_⊙, we calculate the median DE value
as a function of MES within bins of ΔMES = 1. We refer to these stellar radius dependent DE
curves as the ‘base DE’ model, and they are shown in Figure 3 for select stellar radius bins. On
average, the detection efficiency is suppressed for smaller stars (which, in this case, are the
late type dwarf targets). This suppressed recovery derives predominantly from the mismatch
between the limb darkening profile for cool stars and the astrophysical template for transiting
planets, which was designed to match the G dwarf targets in the Kepler sample (2015). The
tabulated base DE model is included as part of the KeplerPORTs software release as the file
detectEffData_alpha_base_02272017.txt in ASCII format. The columns of the file are as follows:
MES bin center, R_⋆ bin center, DE value. The KeplerPORTs software provides python code for
reading and implementing the base DE model.
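
A minimal sketch of how an empirical DE curve and the median ‘base DE’ could be tabulated from
injection outcomes within a single stellar radius bin. The data layout (pairs of MES and a
recovered flag) and the function names are hypothetical, and the quality cuts are not shown.

```python
import statistics
from collections import defaultdict


def de_curve(injections, mes_bin_width=1.0):
    """Fraction of recovered injections per MES bin for one target.

    injections -- iterable of (mes, recovered) pairs
    Returns a dict mapping MES bin center to recovery fraction.
    """
    counts = defaultdict(lambda: [0, 0])      # bin index -> [injected, recovered]
    for mes, recovered in injections:
        index = int(mes // mes_bin_width)
        counts[index][0] += 1
        counts[index][1] += 1 if recovered else 0
    return {
        (index + 0.5) * mes_bin_width: n_rec / n_inj
        for index, (n_inj, n_rec) in sorted(counts.items())
    }


def median_de(target_curves):
    """Median DE per MES bin across the targets in one stellar radius bin."""
    by_bin = defaultdict(list)
    for curve in target_curves:
        for center, value in curve.items():
            by_bin[center].append(value)
    return {center: statistics.median(v) for center, v in sorted(by_bin.items())}


one_target = de_curve([(7.2, True), (7.8, False), (8.1, True)])
print(one_target)
print(median_de([{7.5: 0.4}, {7.5: 0.6}, {7.5: 0.8}]))
```
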


Figure 3: Base DE as a function of MES, with the dependence on stellar radius, R_⋆ [R_⊙], shown
by the color shading (left panel), and the same data relative to the R_⋆ = 1.2 R_⊙ DE (right
panel).

3.3 CDPP Slope Detection Efficiency Model Correction


The next step is to extend the applicability of the base DE model beyond the well-behaved
CDPP slope region. We determine a correction table to the base DE model on a grid across the
CDPP slope plane shown in Figure 2. DE curves from the same shallow FLTI runs as before (i.e.,
KSOC-5006 and KSOC-5104) are examined, but now targets for the full CDPP slope plane are
included. To determine the correction, we first compute the base DE model for each FLTI target
by interpolating the grid of base DE model curves calculated in Section 3.2 using the
appropriate R_⋆ for each target. Then the difference between the base DE model and the
empirically measured FLTI DE curve is calculated for all targets. The median of the base and
empirical DE differences is measured in bins over the two dimensional CDPP slope plane. The
large circles in Figure 2 show the bin centers over which the CDPP slope plane is parceled for
determining the median DE offset. The DE offset is further parceled along the MES direction in
bins of ΔMES = 1, thus effectively making the CDPP slope plane DE correction into a three
dimensional correction table. The tabulated CDPP slope plane DE model corrections are included
as part of the KeplerPORTs software release as the file
detectEffData_alphal2_SlopeLongShort_02272017.txt in ASCII format. The columns of the file are
as follows: MES bin center, the two CDPP slope bin centers (one column each for slpCDPP_l and
slpCDPP_s), DE correction value. The KeplerPORTs software provides python code for reading and
implementing the CDPP slope DE model corrections.

Figure 4 shows the CDPP slope plane DE correction as a function of MES at two fixed values of
one CDPP slope, with curves for different values of the other CDPP slope given by the color
shading. The left panel shows results for a fixed slope of -0.34, and the right panel shows
results for a fixed slope of 0.19. The DE corrections in the CDPP slope plane indicate that
targets with enhanced red noise present in the flux time series (high CDPP slope values) have
reduced sensitivity for recovering planet signals. The DE curve represents the combined
response of the vetoes implemented in the TPS algorithm (2016). The mechanisms responsible for
the vetoes having similar DE response for targets with similar CDPP slopes are not fully
understood, but the CDPP slopes empirically explain the DE variations observed in Kepler
targets.


3.4 Orbital Period Detection Efficiency Model Dependence 


The final step for the DE model accounts for dependence on P_orb. There are not enough
injections performed on each shallow FLTI target to quantify the period dependence. Thus,
we employ the deep-run FLTI targets (~600,000 injections per target; see 2017a). We investigate
the detection efficiency in five period bins bounded by 10, 60, 100, 200, 400, and 700 days.
For each deep FLTI target, we calculate five empirical DE curves, one for each period bin. We
also compute the base DE model for each target from Section 3.2. Both the empirical DE curve
and the base DE model use a MES bin resolution of ΔMES = 0.33. The base DE model is subtracted
from the five period dependent empirical DE curves, yielding the residual DE offsets for the
five period bins. We model the residual DE offset in each period bin independently using a
linear function of the form,


DE_{\rm resid} = c_0 + c_1 \times N_{\rm tr} + c_2 \times ({\rm CDPP\ Slope\ Plane\ Correction}),    (3)
Figure 4: Corrections to the DE at two fixed values of one CDPP slope, -0.34 (left panel) and
0.19 (right panel), for curves of increasing values of the other CDPP slope from red to blue
shading.


where N_tr is the expected number of transits at the midpoint of the period range (which depends
on the data span and duty cycle for the star in question) and (CDPP Slope Plane Correction)
is the interpolated correction determined in Section 3.3. In this linear model, we assume that
the ‘shape’ (i.e., the features) of the CDPP slope plane correction does not depend on period
other than through a scaling factor, c_2. For each of the five period ranges and each MES bin,
the coefficients that minimize the difference between the measured residual offset and the
linear model are computed.

Not all three terms are warranted over the full (P_orb, MES) space. This is especially true
where the DE is close to zero or unity. The constant, c_0, is used over the full parameter range.
For MES < 4.25 and MES > 13.25 in the shortest period bin, and for MES > 20 in the longest
period bin, both c_1 and c_2 are set to zero. The second term, c_1, is set to zero for all but
the two shortest period bins. The third term, c_2, is set to zero when the absolute value of
the CDPP slope plane correction (see Figure 4 for example corrections), interpolated to the
MES bin center and the target’s CDPP slope values, is less than 0.003 when averaged over targets.
These coefficients are stored in the HDF5 format file detectEffData_alphal2_02272017.h5, and
the KeplerPORTs software provides an example of their extraction and implementation for the
per-target DE model.
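
Evaluating Equation 3 for one period bin and one MES bin can be sketched as below. Several
details are assumptions made for illustration: the arithmetic midpoint of each period bin, the
expected number of transits taken as data span times duty cycle divided by period, and the
clipping of the corrected DE to the range 0 to 1. The coefficients themselves would come from
the HDF5 file and are not reproduced here.

```python
PERIOD_BIN_EDGES_DAY = (10.0, 60.0, 100.0, 200.0, 400.0, 700.0)


def period_bin_midpoints(edges=PERIOD_BIN_EDGES_DAY):
    """Arithmetic midpoint of each of the five period bins (an assumption)."""
    return [0.5 * (lo + hi) for lo, hi in zip(edges[:-1], edges[1:])]


def expected_transits(data_span_day, duty_cycle, period_day):
    """Assumed form: observed time divided by the orbital period."""
    return data_span_day * duty_cycle / period_day


def residual_de(c0, c1, c2, n_tr, slope_correction):
    """Equation 3 for one period bin and one MES bin."""
    return c0 + c1 * n_tr + c2 * slope_correction


def corrected_de(base_de, resid):
    """Base DE plus the residual offset, clipped to [0, 1] (an assumption)."""
    return min(1.0, max(0.0, base_de + resid))


# Placeholder coefficients, purely to show the call sequence.
n_tr = expected_transits(1400.0, 0.9, period_bin_midpoints()[0])
print(corrected_de(0.80, residual_de(0.01, 0.0, 0.5, n_tr, -0.1)))
```
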


3.5 Efficacy of the Per-target Detection Efficiency Model 


Overall, the per-target DE model requires six inputs: R_⋆, slpCDPP_l, slpCDPP_s, duty cycle,
data span, and P_orb. The KeplerPORTs software provides an implementation in python that
reads in the requisite data tables and performs the necessary interpolations to yield a DE
curve for the given set of inputs. Figure 5 shows the resulting DE model curves for the five
period bins employed in the per-target DE model for a target in the well-behaved CDPP slope
plane region (left panel) and for a target in the suppressed DE region of the CDPP slope plane
(right panel). The DE is lower and has a stronger period dependence in the presence of red
noise in the flux time series.


Figure 5: Example per-target DE model as a function of MES for the five period bins (10-60,
60-100, 100-200, 200-400, and 400-700 days; blue, green, red, cyan, and magenta, respectively)
with increasing P_orb for the well-behaved CDPP slope plane region (left panel) and for the
upper right of the CDPP slope plane (right panel; panel title R_⋆ = 1.0 with CDPP slopes of
0.08 and 0.87).


Figure 2 shows the quality of the per-target DE model for targets across the CDPP slope
plane. The large points are color shaded by the reduced χ² between the DE model and the
empirical DE from the shallow-run FLTI targets. For a representative value of reduced χ² in
each grid region of the CDPP slope plane, we adopt the upper 90th percentile of reduced χ²
values. The reduced χ² is calculated in the transition region (5 < MES < 12) of the DE. In
calculating the uncertainty in the empirical DE FLTI measurement for use in χ², we employ the
analytic Bayesian parameter estimation model using a binomial model likelihood and an
uninformative beta distribution (α = 0.5, β = 0.5) prior. This idealized uncertainty model tends
to underpredict the per bin MES scatter; thus we remove the largest outlier before calculating
reduced χ², resulting in seven degrees of freedom for the seven MES bins that contribute to the
χ² calculation. A reduced χ² ~ 2.5 with seven degrees of freedom represents a 3-σ residual from
the model. Thus, locations in the CDPP slope plane with a white to red color shading are
locations where locally more than 10% of the targets have empirical DE that significantly
disagree with the DE model.
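
A sketch of the uncertainty and reduced χ² calculation described above, under stated
assumptions: the empirical DE in each MES bin is taken as the mean of the Beta(k + 0.5,
n - k + 0.5) posterior for k recoveries out of n injections, the posterior variance is used as
the squared uncertainty, and the sum over the remaining bins is divided by the number of
remaining bins. The function names are hypothetical.

```python
def beta_posterior_mean_var(n_recovered, n_injected, a0=0.5, b0=0.5):
    """Posterior mean and variance for a binomial rate with a Beta(a0, b0) prior."""
    a = n_recovered + a0
    b = n_injected - n_recovered + b0
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var


def reduced_chi2_drop_worst(model_de, n_recovered, n_injected):
    """Reduced chi-square over MES bins after discarding the largest outlier."""
    terms = []
    for model, k, n in zip(model_de, n_recovered, n_injected):
        mean, var = beta_posterior_mean_var(k, n)
        terms.append((mean - model) ** 2 / var)
    terms.remove(max(terms))      # discard the single largest contribution
    return sum(terms) / len(terms)


mean, var = beta_posterior_mean_var(5, 10)
print(mean, var)
```
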
3.6 Limitations of the Per-target Detection Efficiency Model


The per-target DE model is not applicable to all Kepler targets. Several categories of targets 
strongly deviate from the model. First, the DE model is not applicable for targets with transit 
durations >15 hr, where FLTI tests reveal a strong suppression of transit signals. This occurred 
because the TPS algorithm was run with a maximum search transit duration of 15 hr, which also 
determined the time scale over which the whitening filter in TPS aims to protect transit signals. 
Thus, transit signals on time scales longer than 15 hr are removed by the whitening filter and 
FLTI tests show that transit signals with longer transit durations have highly suppressed DE 
curves. The 15 hr limitation is suitable for accommodating dwarf star (R_⋆ < 1.25 R_⊙)
targets with modest eccentricities. However, researchers wanting to perform
occurrence rate studies on subgiant and giant Kepler targets will need to explore 
the full database of FLTI information and derive a more complete DE model. In 
solving for the current per-target DE model, all transit injections that resulted in a transit 
duration >15 hr were removed from consideration. 

The second limitation is for targets which have significant amounts of in-transit data re- 
moved in subsequent searches for additional transiting planets. The Kepler occurrence rate 
data products are designed to quantify pipeline completeness under the assumption that the 
target is planet free. Thus, the input to FLTI tests and noise estimates for occurrence rate 
purposes are performed on a flux time series where transits identified in the DR25 pipeline 
run are effectively removed by replacement and fully deweighting the cadences so as not to 
contribute to the planet search or noise statistics. Examination of the FLTI results finds that 
targets with >10% of their data removed from previously identified transits have suppressed 
DE curves. The population of targets with >10% of data removed with P_orb > 10 days is small
(5% of Kepler targets and 10% of Kepler targets with planet candidate KOIs). Thus it is safe
to ignore this effect for planet occurrence rates at P_orb ≳ 10 days. However, for
investigations of multi-planet systems with a short period (P_orb < 3 day) planet, the DE model
is likely optimistic after
cadences from the first identified planet are removed. In solving for the per-target DE model, 
we adopted a more conservative threshold and do not consider FLTI injections on targets that 
experienced >5% fractional decrease in the duty cycle from removal of planet signal cadences. 
The fractional duty cycle drop during the DR25 pipeline run can be obtained for all Kepler 
targets in the DR25 Stellar Table by comparing the tabulated values of the duty cycle calcu- 
lated before and after the pipeline has identified all transit events (i.e., utilize the parameters 
“dutycycle” and “dutycycle_post” in Appendix A of Burke & Catanzarite (2017b)).

Finally, we examined targets that are not well-fit by the final DE model. We identify two 
characteristics of their flux time series that are common to the targets not well modeled. We do 
not attempt to model the impact on the DE using these additional characteristics. However, we 
qualitatively describe the behavior and provide a list of targets that are not well characterized by 
the per-target DE model. The first notable characteristic regards flux brightening events that 
occur on the transit timescale. In the presence of such features, the brightening systematic 
offsets/fills-in the dimming transit signal and prevents its recovery. We developed a metric 
that quantifies targets influenced by systematic brightening features. We employ diagnostic 
information from the TPS algorithm that provides an estimate of the depth of a purported 
transit signal centered on every cadence. From the depth estimate time series, we formulate
a metric that identifies targets with brightening systematics that cause a heavy tail in the 
distribution of depth estimates relative to the Gaussian distribution. 

The second metric examines the properties of the wavelet based whitening filter employed 
in TPS. In the presence of strong astrophysical noise, the whitening filter can have a higher 
amplitude of signal suppression and be more aggressive at detrending the data. The more 
aggressive the whitening filter, the greater the tendency for transforming a u-shaped transit 
signal into a signal concentrated exclusively at the ingress and egress cadences and suppressing 
the transit depth. We developed a metric that quantifies this shape change, as anecdotal 
evidence shows that when the filtered transit signal shape is concentrated into a small number 
of ingress and egress cadences, the veto algorithms in TPS are more susceptible to stochastic 
and systematic variations of single cadences. Both these metrics are calculated on the flux 
time series data after planet signals have been removed by the planet search. The external file, 
DR25_DEModel_NoisyTargetList.txt, lists the Kepler ID for 8629 targets that have elevated 
values of these two metrics and empirically are outliers from the DE model. This list contains 
1.2% of the GKM Kepler dwarf star targets. 

In summary, the per-target DE model should not be used for targets with
R_⋆ > 1.25 R_⊙, targets with an expected transit duration >15 hr, multi-planet systems
with fractional duty cycle drops >10%, and the targets reported to have deviant
flux time-series (i.e., see DR25_DEModel_NoisyTargetList.txt).
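
The summary criteria can be collected into a single check, sketched below. The function names
and argument layout are hypothetical, the fractional duty cycle drop is assumed to be the
decrease relative to the pre-search value, and the Kepler IDs in the example are placeholders
rather than entries from the noisy target list.

```python
def duty_cycle_drop(dutycycle, dutycycle_post):
    """Fractional decrease in duty cycle after planet-signal cadences are removed."""
    return (dutycycle - dutycycle_post) / dutycycle


def de_model_applicable(rstar_rsun, duration_hr, dutycycle, dutycycle_post,
                        kepler_id, noisy_ids):
    """True when none of the four exclusion criteria in the summary applies."""
    return (
        rstar_rsun <= 1.25
        and duration_hr <= 15.0
        and duty_cycle_drop(dutycycle, dutycycle_post) <= 0.10
        and kepler_id not in noisy_ids
    )


noisy_ids = {456}     # placeholder; in practice read from the noisy target list
print(de_model_applicable(1.0, 6.0, 0.88, 0.87, 123, noisy_ids))
```
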


4 MES Smearing 


The next stage of calculating a detection contour involves mapping the MES estimate of a
hypothetical planet with a given P_orb and R_p (Section 2) to a recovery fraction via the DE
model. Section 3 describes a per-target DE model new to DR25 that characterizes the recovery
fraction as a function of MES. One can also adopt the target averaged DE model based upon
the PLTI test (Christiansen 2017). However, before employing any DE model for a detection
contour calculation, one additional conditioning step is necessary to achieve agreement between
the FLTI tests (Burke & Catanzarite) and the detection contour models described in (2015). The
model deficiency occurs in the process of converting MES to detection probability using the DE
model alone. A more accurate conditioning of the DE model is needed that properly takes into
account the distribution of impact parameters, which we call the ‘MES smearing effect’.
Specifically, for a given point [P_orb, R_p] in the detection contour, the impact parameter acts
as a third dimension that influences the recoverability of a transit signal. MES smearing is a
method to properly marginalize over the third dimension of impact parameter for transit signal
recoverability when modeling a detection contour, rather than the previous method of treating
the impact parameter distribution inaccurately with a point estimate.

For a given R_p, MES is maximum for impact parameter b = 0, but MES decays toward zero for
higher impact parameters due to limb darkening effects and eventually due to grazing geometry.
The previous procedure of estimating the recovery fraction at the average impact parameter is
inaccurate, since this point estimate of the detection probability does not take into account
the asymmetric high impact parameter tail towards low MES. The left panel of Figure 6 shows an
example target where we have plotted the MES as a function of impact parameter over a small
region of P_orb and R_p from the FLTI output. The MES in Figure 6 has been normalized to the
median MES for impact parameters near zero (b < 0.05). The right panel of Figure 6 shows data
similar to the left panel, but displayed as a histogram of the normalized MES values built from
the average of 50 trial [P_orb, R_p] locations (blue points). We find that MES as a function of
impact parameter does not vary with P_orb or R_p for a given target. The normalized MES
distribution varies between zero and one; thus a beta probability distribution describes the
MES distribution well. The green line shows a beta distribution fit to the empirical MES
distribution. The sudden drop off at normalized MES < 0.2 is an artifact of only injecting
transits with b < 1 rather than injecting up to the full grazing transit with b = 1 + R_p/R_⋆.


Figure 6: Scaled MES as a function of impact parameter for a small region of P_orb and R_p from
the FLTI output (left panel). Resulting distribution (relative counts) of normalized MES values
built from 50 trial (P_orb, R_p) locations (blue points) and a beta distribution fit to the MES
distribution (green line; right panel).


In practice, MES smearing is applied to the DE model from Section 3 using a Monte Carlo
integration method. The effective, smeared DE calculation is performed for every MES bin
of the original DE model. In this MES-smeared representation of the DE model, the MES
of the bin represents the highest MES (b = 0) transit signal. MES smearing is calculated by
sampling from the beta probability distribution and converting the beta distribution samples,
which range from 0 to 1, into MES values scaled by the current MES bin value. The beta-distributed
scaled MES values are converted to recovery fractions using the original DE curve. The resulting
recovery fraction samples are averaged to provide the final 'smeared' DE curve. Figure 7 shows
a model DE from Section 3 (dashed line) and the same model after applying the MES smearing
distribution (solid line).
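
The Monte Carlo procedure described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the KeplerPORTs implementation: toy_de is a hypothetical logistic stand-in for the Section 3 DE model, the beta parameters 2.75 and 0.45 are illustrative values, and the function names and sample count are invented for this example.

```python
import math
import random

def toy_de(mes):
    """Hypothetical stand-in for the Section 3 DE model (logistic in MES)."""
    return 1.0 / (1.0 + math.exp(-(mes - 8.0)))

def smeared_de(mes_bin, alpha, beta, de_func, n_samples=5000, seed=0):
    """Monte Carlo average of de_func over beta-distributed scaled MES.

    mes_bin is treated as the b = 0 (highest) MES. The same seed is reused
    for every bin, so all bins share one set of beta samples.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        scale = rng.betavariate(alpha, beta)  # normalized MES in [0, 1]
        total += de_func(mes_bin * scale)     # recovery fraction at scaled MES
    return total / n_samples

mes_bins = [4.0 + 0.5 * i for i in range(33)]  # MES bins from 4.0 to 20.0
smeared = [smeared_de(m, 2.75, 0.45, toy_de) for m in mes_bins]
```

Because every scaled MES is at most the bin MES and the toy DE curve increases with MES, the smeared curve lies at or below the original curve at every bin, which is the qualitative behavior described for Figure 7.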

We have found from FLTI tests that the MES distribution as a function of impact parameter
is not the same for all targets. For targets with higher levels of astrophysical variability and/or
red noise, the whitening filter in TPS is more aggressive and the MES distribution with impact
parameter is flatter. Empirically, we find that slpCDPP_l is a good proxy for identifying targets
with flattened MES distributions. When modeling the MES distribution for the FLTI targets,
we find that a quadratic function in slpCDPP_l describes the beta distribution parameters:


$$ \alpha = 5.324687 \times \mathrm{CSL}^2 + 3.622976 \times \mathrm{CSL} + 2.753682 \tag{4} $$
$$ \beta = -0.288250 \times \mathrm{CSL}^2 - 0.118463 \times \mathrm{CSL} + 0.450678, \tag{5} $$


where α and β are the two parameters of the beta distribution and CSL = slpCDPP_l. From
the deep-run FLTI targets (FLTI run identifiers KSOC-5004, KSOC-5008, and KSOC-5125
from Burke & Catanzarite 2017a), we fit the beta distribution parameters to a sample of 72
targets after quality control cuts. See the KeplerPORTs software for the implementation of
MES smearing for building a detection contour.
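
Equations 4 and 5 translate directly into code. The sketch below is illustrative only: the function name beta_params and the validity check are assumptions of this example and are not taken from KeplerPORTs.

```python
def beta_params(csl):
    """Evaluate Equations 4 and 5 for a given CDPP slope value (CSL)."""
    alpha = 5.324687 * csl ** 2 + 3.622976 * csl + 2.753682
    beta = -0.288250 * csl ** 2 - 0.118463 * csl + 0.450678
    # A beta distribution needs both parameters to be positive. This guard
    # is an addition of this sketch, not part of the published fit.
    if alpha <= 0.0 or beta <= 0.0:
        raise ValueError("CSL outside the range where the fit is usable")
    return alpha, beta

alpha0, beta0 = beta_params(0.0)    # reduces to the constant terms
alpha1, beta1 = beta_params(-0.5)   # an illustrative negative slope
```

Equation 5 becomes non-positive for sufficiently large |CSL|, so a guard of this kind is a cautious way to avoid sampling from an undefined distribution.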


5 Window Function 


The window function specifies the fraction of phase space, as a function of P_orb, for which the
minimum detection requirements (such as having at least three transit events) are met. See
Burke & McCullough (2014), Burke et al. (2015), and Burke & Catanzarite (2017b) for a description
of the full details of the window function. The use of the window function in a detection
contour has not changed since Burke et al. (2015), with the exception that a detailed numerical
model of the Kepler window function is now available (Burke & Catanzarite 2017b).
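
To make the "at least three transit events" requirement concrete, the following sketch evaluates a simplified binomial window function. It is not the numerical window function of Burke & Catanzarite (2017b): it assumes, for illustration only, that each transit opportunity falls on valid data independently with probability equal to the duty cycle, and that the number of opportunities is the data span divided by the period, rounded down. The function name and arguments are hypothetical.

```python
import math

def window_function_binomial(period_days, data_span_days, duty_cycle,
                             min_transits=3):
    """Probability that at least min_transits transits land on valid data."""
    n = int(data_span_days // period_days)  # transit opportunities
    if n < min_transits:
        return 0.0
    # Probability of observing fewer than min_transits events out of n.
    p_fewer = sum(
        math.comb(n, k) * duty_cycle ** k * (1.0 - duty_cycle) ** (n - k)
        for k in range(min_transits)
    )
    return 1.0 - p_fewer

w_short = window_function_binomial(10.0, 1460.0, 1.0)
w_edge = window_function_binomial(400.0, 1460.0, 0.9)
w_long = window_function_binomial(600.0, 1460.0, 0.9)
```

The approximation ignores the actual placement of data gaps, which is why a per-target numerical window function is preferable where available.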


6 Detection Contour 


Construction of a detection contour for Kepler pipeline completeness and its application for
planet occurrence rate analysis is described in Burke et al. (2015). The DR25 detection contour
model follows the general procedure of Burke et al. (2015) but updates the inputs for
much improved accuracy and precision. The previous sections describe each of the steps for
constructing a detection contour, and for the convenience of the end user, we are releasing
the DR25 KeplerPORTs python module that provides code for all the steps discussed in this
document and illustrates the calculation of a detection contour for a single target. The inputs
to KeplerPORTs are the window function and one-sigma depth function data (available
for download at the NASA Exoplanet Archive and documented in Burke & Catanzarite
2017b), data tables for the per-target detection efficiency model (released as files
with the KeplerPORTs software), and a set of input values (R_*, slpCDPP_l, slpCDPP_s, duty
cycle, data span, limb darkening coefficients) available from the DR25 Kepler stellar data table's
occurrence rate columns (Burke & Catanzarite 2017a,b).
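
The way these pieces combine on a grid of P_orb and R_p can be sketched as follows. This is a schematic under stated assumptions rather than the KeplerPORTs API: the three toy_ functions are hypothetical placeholders for the Section 2 MES estimate, the smeared DE curve, and the window function, and the sketch assumes that the contour value is the product of the smeared DE and the window function.

```python
import math

def toy_expected_mes(period_days, radius_earth):
    """Hypothetical MES scaling, NOT the Section 2 calculation."""
    return 3.0 * radius_earth ** 2 * math.sqrt(100.0 / period_days)

def toy_smeared_de(mes):
    """Hypothetical smooth DE curve rising from 0 to 1 around MES = 9."""
    return 1.0 / (1.0 + math.exp(-(mes - 9.0)))

def toy_window(period_days, data_span_days=1460.0):
    """Hypothetical window function: 1 at short periods, 0 beyond span / 2."""
    if period_days <= data_span_days / 4.0:
        return 1.0
    if period_days >= data_span_days / 2.0:
        return 0.0
    return (data_span_days / 2.0 - period_days) / (data_span_days / 4.0)

def detection_contour(periods, radii, mes_func, de_func, window_func):
    """Return rows indexed by radius and columns indexed by period."""
    return [
        [de_func(mes_func(p, r)) * window_func(p) for p in periods]
        for r in radii
    ]

periods = [10.0, 50.0, 200.0, 500.0, 800.0]  # days
radii = [0.5, 1.0, 2.0, 4.0]                 # Earth radii
contour = detection_contour(periods, radii, toy_expected_mes,
                            toy_smeared_de, toy_window)
```

Passing the three ingredients as callables keeps the grid loop independent of how each ingredient is computed, so the placeholders could be swapped for per-target models.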

As an example use of KeplerPORTs to generate a single target detection contour, Figure 8
shows the resulting detection contour for the target KIC 3429335. For comparison, we show
an empirical detection contour derived from the FLTI test in Figure 9. The difference between
the model and empirical detection contour is shown in Figure 10, which demonstrates the good
agreement between the two.

Figure 7: DE model curve from Section 3 (dashed line) and after MES smearing has been applied
(solid line) for the properties of KIC 8311864, host to Kepler-452b (Jenkins et al. 2015).


7 Epilogue 


The detection contour model described in this document and implemented by the KeplerPORTs
software only quantifies the recoverability of transiting planet signals by the Kepler pipeline.
In other words, it accurately portrays the ability of the Kepler pipeline to generate a Threshold
Crossing Event (TCE) for a given hypothetical planet (Twicken et al. 2016; Jenkins et al.
2017). However, the subsequent classification steps that turn TCEs into Kepler Objects of
Interest (KOIs) and then disposition them (Thompson et al., in preparation) as planet
candidates (PCs) or false positives (FPs) are not accounted for by these detection
contours. Thus, for accurate planet occurrence rate calculations using the Kepler PC population,
one must augment the detection contours for the completeness of these classification steps
(2015). The effective completeness of the classification steps can take the form of
an additional DE curve that multiplies the per-target Kepler pipeline DE model of Section 3.
The raw data used to derive the classification DE curve are described in Coughlin (2017) and
available through the NASA Exoplanet Archive. In addition to completeness, any detection experiment
must quantify the potential for false alarms (or type I errors) contaminating the PC sample
(referred to as planet sample reliability). In DR25, we introduced several tests that quantify
the level of false alarm contamination (Thompson et al., in preparation). The
completeness of the classification steps and planet sample reliability have not been included
in any Kepler planet occurrence rate calculations to date. A major advancement for the final
DR25 Kepler planet catalog is that planet occurrence rates can finally take into account these
important contributions that ultimately shape the detected planet sample (Burke et al. 2015;
Thompson et al., in preparation).

Figure 8: Detection contour model for KIC 3429335 using the KeplerPORTs software and the
inputs described in this document.

Figure 9: Empirical detection contour for KIC 3429335 using the FLTI output data set for this
target.

Figure 10: Difference between the detection contour model (Figure 8) and the empirical
detection contour (Figure 9) for KIC 3429335.




KSCI-19111-002: Detection Contour June 13, 2017 


References 

Bastien, F. A., Stassun, K. G., Basri, G., & Pepper, J. 2016, ApJ, 818, 43

Burke, C. J., & McCullough, P. R. 2014, ApJ, 792, 79

Burke, C. J., Christiansen, J. L., Mullally, F., et al. 2015, ApJ, 809, 8

Burke, C. J., & Catanzarite, J. 2017a, Planet Detection Metrics: Per-Target Flux-Level Transit
Injection Tests of TPS for Data Release 25 (KSCI-19109-001)

Burke, C. J., & Catanzarite, J. 2017b, Planet Detection Metrics: Window and One-Sigma Depth
Functions for Data Release 25 (KSCI-19101-002)

Christiansen, J. L., Jenkins, J. M., Caldwell, D. A., et al. 2012, PASP, 124, 1279

Christiansen, J. L., Clarke, B. D., Burke, C. J., et al. 2015, ApJ, 810, 95

Christiansen, J. L., Clarke, B. D., Burke, C. J., et al. 2016, ApJ, 828, 99

Christiansen, J. L. 2017, Planet Detection Metrics: Pixel-Level Transit Injection Tests of
Pipeline Detection Efficiency for Data Release 25 (KSCI-19110-001)

Coughlin, J. L., Mullally, F., Thompson, S. E., et al. 2016, ApJS, 224, 12

Coughlin, J. L. 2017, Planet Detection Metrics: Robovetter Completeness and Effectiveness for
Data Release 25 (KSCI-19114-001)

Jenkins, J. M. 2002, ApJ, 575, 493

Jenkins, J. M., Twicken, J. D., Batalha, N. M., et al. 2015, AJ, 150, 56

Jenkins, J. M. 2017, Kepler Mission Data Processing Handbook (KSCI-19081-002)

Mandel, K., & Agol, E. 2002, ApJ, 580, L171

Mathur, S., Huber, D., Batalha, N. M., et al. 2017, ApJS, 229, 30

Seader, S., Jenkins, J. M., Tenenbaum, P., et al. 2015, ApJS, 217, 18

Thompson, S. E. 2016, Data Validation Time Series File: Description of the File Format and
Content (KSCI-19079-001)

Twicken, J. D., Jenkins, J. M., Seader, S. E., et al. 2016, AJ, 152, 158




